Discussion of "Frequentist coverage of adaptive nonparametric Bayesian credible sets"
Discussion of "Frequentist coverage of adaptive nonparametric Bayesian credible sets" by Szabó, van der Vaart and van Zanten [arXiv:1310.4489v5].
Comment: Published at http://dx.doi.org/10.1214/15-AOS1270E in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Multivariate Gaussian Network Structure Learning
We consider a graphical model where a multivariate normal vector is
associated with each node of the underlying graph and estimate the graphical
structure. We minimize a loss function obtained by regressing the vector at
each node on those at the remaining ones under a group penalty. We show that
the proposed estimator can be computed by a fast convex optimization algorithm.
We show that, as the sample size increases, the regression coefficients are estimated consistently and the correct graphical structure is recovered with probability tending to one. Extensive simulations show the superiority of the proposed method over comparable procedures. We apply the technique to two real datasets: the first identifies gene and protein networks appearing in cancer cell lines, and the second reveals the connections among different industries in the US.
Comment: 30 pages, 17 figures, 3 tables
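The node-wise regression scheme described in the abstract can be sketched in a simplified form: one scalar per node, with a plain l1 penalty standing in for the paper's group penalty (the function names, toy graph, and tuning constant below are illustrative, not from the paper):

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent lasso: minimize (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual excluding coordinate j, then soft-threshold
            r = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r
            beta[j] = np.sign(z) * max(abs(z) - lam * n, 0.0) / col_sq[j]
    return beta

def neighborhood_select(X, lam=0.05):
    """Estimate graph edges by regressing each node on the remaining ones;
    an edge survives only if it is selected from both endpoints."""
    p = X.shape[1]
    adj = np.zeros((p, p), dtype=bool)
    for j in range(p):
        rest = np.delete(np.arange(p), j)
        adj[j, rest] = lasso_cd(X[:, rest], X[:, j], lam) != 0
    return adj & adj.T  # symmetrize with the "AND" rule

# toy chain graph 0 - 1 - 2 encoded in the precision matrix
rng = np.random.default_rng(0)
prec = np.array([[2.0, 0.8, 0.0], [0.8, 2.0, 0.8], [0.0, 0.8, 2.0]])
X = rng.multivariate_normal(np.zeros(3), np.linalg.inv(prec), size=500)
adj = neighborhood_select(X)
```

With enough data the nonzero pattern of the node-wise regressions recovers the true chain, mirroring the selection-consistency property stated above.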
Bayesian ROC surface estimation under verification bias
The Receiver Operating Characteristic (ROC) surface is a generalization of the ROC curve and is widely used to assess the accuracy of diagnostic tests on three categories. A complication called verification bias, meaning that not all subjects have their true disease status verified, often occurs in real applications of ROC analysis. This is a common problem since the gold standard test, which is used to determine the true disease status, can be invasive and expensive. In this paper, we propose a Bayesian approach for estimating the ROC surface based on continuous data under a semi-parametric trinormality assumption. Our proposed method, under assumptions often adopted in ROC analysis, can also be extended to situations in the presence of verification bias. We compute the posterior distribution of the parameters under the trinormality assumption by using a rank-based likelihood. Consistency of the posterior under mild conditions is also established. We compare our method with existing methods for estimating the ROC surface and conclude that our method performs well in terms of accuracy.
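For fully verified data, the accuracy summarized by the ROC surface is often reported through the volume under the surface (VUS): the probability that a random triple, one subject from each of the three ordered classes, is correctly ranked. A minimal nonparametric sketch (the trinormal toy data and names are illustrative, not the paper's Bayesian estimator):

```python
import numpy as np

def empirical_vus(x1, x2, x3):
    """Empirical volume under the ROC surface: P(X1 < X2 < X3),
    the chance a random triple from the three classes is correctly ordered."""
    a = np.asarray(x1)[:, None, None]
    b = np.asarray(x2)[None, :, None]
    c = np.asarray(x3)[None, None, :]
    return np.mean((a < b) & (b < c))

rng = np.random.default_rng(1)
# trinormal toy data: three classes with increasing means
x1 = rng.normal(0.0, 1.0, 200)
x2 = rng.normal(1.5, 1.0, 200)
x3 = rng.normal(3.0, 1.0, 200)
vus = empirical_vus(x1, x2, x3)  # chance level is 1/6 for a useless test
```

A VUS near 1/6 indicates a test no better than random ordering; values near 1 indicate nearly perfect three-class separation.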
Kullback Leibler property of kernel mixture priors in Bayesian density estimation
Positivity of the prior probability of Kullback-Leibler neighborhood around
the true density, commonly known as the Kullback-Leibler property, plays a
fundamental role in posterior consistency. A popular prior for Bayesian
estimation is given by a Dirichlet mixture, where the kernels are chosen
depending on the sample space and the class of densities to be estimated. The
Kullback-Leibler property of the Dirichlet mixture prior has been shown for
some special kernels like the normal density or Bernstein polynomial, under
appropriate conditions. In this paper, we obtain easily verifiable sufficient
conditions, under which a prior obtained by mixing a general kernel possesses
the Kullback-Leibler property. We study a wide variety of kernels used in practice, including the normal, histogram, gamma, and Weibull densities, and show that the Kullback-Leibler property holds if some easily verifiable conditions are satisfied at the true density. This gives a catalog of conditions required for the Kullback-Leibler property, which can be readily used in applications.
Comment: Published at http://dx.doi.org/10.1214/07-EJS130 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Efficient Bayesian estimation and uncertainty quantification in ordinary differential equation models
Often the regression function is specified by a system of ordinary
differential equations (ODEs) involving some unknown parameters. Typically
analytical solution of the ODEs is not available, and hence likelihood
evaluation at many parameter values by numerical solution of equations may be
computationally prohibitive. Bhaumik and Ghosal (2015) considered a Bayesian
two-step approach by embedding the model in a larger nonparametric regression
model, where a prior is put through a random series based on B-spline basis
functions. A posterior on the parameter is induced from the regression function
by minimizing an integrated weighted squared distance between the derivative of
the regression function and the derivative suggested by the ODEs. Although this
approach is computationally fast, the Bayes estimator is not asymptotically
efficient. In this paper we suggest a modification of the two-step method by directly considering the distance between the function in the nonparametric model and that obtained from a four-stage Runge-Kutta (RK4) method. We also study the asymptotic behavior of the posterior distribution of the parameter based on an approximate likelihood obtained from an RK4 numerical solution of the ODEs. We establish a Bernstein-von Mises theorem for both methods, which assures that the Bayesian uncertainty quantification matches the frequentist one and that the Bayes estimator is asymptotically efficient.
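A point-estimate analogue of the RK4-distance idea can be sketched on a hypothetical one-parameter logistic ODE: smooth the noisy data, then pick the parameter whose RK4 solution is closest to the smoothed curve. The ODE, noise level, grid search, and polynomial smoother below are all illustrative choices, not the paper's B-spline prior or posterior computation:

```python
import numpy as np

def rk4_solve(theta, x0, ts):
    """RK4 numerical solution of the logistic ODE dx/dt = theta*x*(1-x),
    a hypothetical stand-in for the paper's general ODE system."""
    f = lambda x: theta * x * (1.0 - x)
    xs = [x0]
    for h in np.diff(ts):
        x = xs[-1]
        k1 = f(x); k2 = f(x + h * k1 / 2); k3 = f(x + h * k2 / 2); k4 = f(x + h * k3)
        xs.append(x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6)
    return np.array(xs)

rng = np.random.default_rng(2)
theta_true, x0 = 1.5, 0.1
ts = np.linspace(0.0, 4.0, 81)
y = rk4_solve(theta_true, x0, ts) + rng.normal(0.0, 0.02, ts.size)

# step 1: smooth the data (polynomial fit standing in for the B-spline step)
smooth = np.polyval(np.polyfit(ts, y, 7), ts)

# step 2: choose theta minimizing the distance to the RK4 solution
grid = np.linspace(0.5, 3.0, 251)
losses = [np.sum((smooth - rk4_solve(t, x0, ts)) ** 2) for t in grid]
theta_hat = grid[np.argmin(losses)]
```

In the paper this distance induces a posterior on the parameter; the sketch only shows why matching against the RK4 solution, rather than against a derivative estimate, pins down the parameter directly.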
Adaptive Bayesian density regression for high-dimensional data
Density regression provides a flexible strategy for modeling the distribution of a response variable y given predictors x by treating the conditional density of y given x as a completely unknown function and allowing its shape to change with the value of x. The number of predictors p may be very large, possibly much larger than the number of observations n, but the conditional density is
assumed to depend only on a much smaller number of predictors, which are
unknown. In addition to estimation, the goal is also to select the important
predictors which actually affect the true conditional density. We consider a
nonparametric Bayesian approach to density regression by constructing a random
series prior based on tensor products of spline functions. The proposed prior
also incorporates the issue of variable selection. We show that the posterior
distribution of the conditional density contracts adaptively at the truth
nearly at the optimal oracle rate, determined by the unknown sparsity and
smoothness levels, even in the ultra high-dimensional settings where p increases exponentially with n. The result is also extended to the
anisotropic case where the degree of smoothness can vary in different
directions, and both random and deterministic predictors are considered. We
also propose a technique to calculate posterior moments of the conditional
density function without requiring Markov chain Monte Carlo methods.
Comment: Published at http://dx.doi.org/10.3150/14-BEJ663 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
Bayesian estimation in differential equation models
Ordinary differential equations (ODEs) are used to model dynamic systems
appearing in engineering, physics, biomedical sciences and many other fields.
These equations contain unknown parameters, say theta, of physical significance, which have to be estimated from the noisy data. Often there is no
closed form analytic solution of the equations and hence we cannot use the
usual non-linear least squares technique to estimate the unknown parameters.
There is a two step approach to solve this problem, where the first step
involves fitting the data nonparametrically. In the second step the parameter
is estimated by minimizing the distance between the nonparametrically estimated
derivative and the derivative suggested by the system of ODEs. The statistical
aspects of this approach have been studied under the frequentist framework. We
consider this two step estimation under the Bayesian framework. The response
variable is allowed to be multidimensional and the true mean function of it is
not assumed to be in the model. We induce a prior on the regression function
using a random series based on the B-spline basis functions. We establish the
Bernstein-von Mises theorem for the posterior distribution of the parameter of
interest. Interestingly, even though the posterior distribution of the regression function based on splines converges at a rate slower than n^{-1/2}, the parameter vector is nevertheless estimated at the n^{-1/2} rate.
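For an ODE that is linear in the parameter, the second minimization step of this two-step scheme even has a closed form. A sketch on a hypothetical exponential-decay ODE dx/dt = -theta*x, with a polynomial fit as an illustrative stand-in for the B-spline series prior (all choices below are for illustration, not the paper's exact construction):

```python
import numpy as np

rng = np.random.default_rng(3)
theta_true = 0.8
ts = np.linspace(0.0, 3.0, 61)
y = np.exp(-theta_true * ts) + rng.normal(0.0, 0.01, ts.size)

# step 1: fit the data nonparametrically (polynomial stand-in for B-splines)
coef = np.polyfit(ts, y, 6)
m = np.polyval(coef, ts)                    # fitted regression function
m_prime = np.polyval(np.polyder(coef), ts)  # its derivative

# step 2: minimize the integrated squared distance between m'(t) and the
# ODE-implied derivative -theta*m(t); linear in theta, hence closed form
theta_hat = -(m_prime @ m) / (m @ m)
```

The key point of the abstract is visible here: the smoothed curve m and its derivative are estimated at a nonparametric rate, yet the single projected quantity theta_hat can still converge at the parametric n^{-1/2} rate.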
Bayesian inference for higher order ordinary differential equation models
Often the regression function appearing in fields like economics,
engineering, biomedical sciences obeys a system of higher order ordinary
differential equations (ODEs). The equations are usually not analytically
solvable. We are interested in making inferences about the unknown parameters appearing in the equations. A significant amount of work has been done on parameter estimation in first order ODE models. Bhaumik and Ghosal (2014a) considered a two-step
Bayesian approach by putting a finite random series prior on the regression
function using B-spline basis. The posterior distribution of the parameter
vector is induced from that of the regression function. Although this approach
is computationally fast, the Bayes estimator is not asymptotically efficient.
Bhaumik and Ghosal (2014b) remedied this by directly considering the distance
between the function in the nonparametric model and a Runge-Kutta (RK)
approximate solution of the ODE while inducing the posterior distribution on
the parameter. They also studied the direct Bayesian method obtained from the
approximate likelihood obtained by the RK4 method. In this paper we extend
these ideas for the higher order ODE model and establish Bernstein-von Mises
theorems for the posterior distribution of the parameter vector for each method
with the n^{-1/2} contraction rate.
Comment: arXiv admin note: substantial text overlap with arXiv:1411.116
Adaptive Bayesian procedures using random series priors
We consider a prior for nonparametric Bayesian estimation which uses finite
random series with a random number of terms. The prior is constructed through
distributions on the number of basis functions and the associated coefficients.
We derive a general result on adaptive posterior convergence rates for all
smoothness levels of the function in the true model by constructing an
appropriate "sieve" and applying the general theory of posterior convergence
rates. We apply this general result to several statistical problems such as signal processing, density estimation, various nonparametric regressions, classification, spectral density estimation and functional regression. The prior can be viewed as an alternative to the commonly used Gaussian process prior, but properties of the posterior distribution can be analyzed by relatively simpler techniques, and in many cases it allows a simpler approach to computation without using Markov chain Monte Carlo (MCMC) methods. A simulation study is conducted to show that the accuracies of the Bayesian estimators based on the random series prior and the Gaussian process prior are comparable. We apply the method to two interesting data sets on functional regression.
Comment: arXiv admin note: substantial text overlap with arXiv:1204.423
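A draw from such a finite random series prior is simple to simulate: first draw the number of basis terms, then the coefficients. The geometric prior on the number of terms and the cosine basis below are illustrative stand-ins for the distributions and bases discussed in the paper:

```python
import numpy as np

rng = np.random.default_rng(5)
xs = np.linspace(0.0, 1.0, 200)

def draw_from_prior():
    """One draw from a finite random series prior: a random number of
    basis terms with independent Gaussian coefficients."""
    n_terms = rng.geometric(0.2)             # prior on the number of terms
    coefs = rng.normal(0.0, 1.0, n_terms)    # prior on the coefficients
    basis = np.cos(np.outer(np.arange(n_terms), np.pi * xs))
    return coefs @ basis                     # random function on [0, 1]

f = draw_from_prior()
```

Because each draw is a finite linear combination, posterior computation can often proceed by conjugate or direct calculations given the number of terms, which is the source of the MCMC-free computation mentioned above.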
Posterior consistency of Gaussian process prior for nonparametric binary regression
Consider binary observations whose response probability is an unknown smooth
function of a set of covariates. Suppose that a prior on the response
probability function is induced by a Gaussian process mapped to the unit
interval through a link function. In this paper we study consistency of the
resulting posterior distribution. If the covariance kernel has derivatives up
to a desired order and the bandwidth parameter of the kernel is allowed to take
arbitrarily small values, we show that the posterior distribution is consistent
in the L_1-distance. As an auxiliary result to our proofs, we show that, under certain conditions, a Gaussian process assigns positive probabilities to the uniform neighborhoods of a continuous function. This result may be of independent interest in the literature on small ball probabilities of Gaussian processes.
Comment: Published at http://dx.doi.org/10.1214/009053606000000795 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
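The prior construction in the abstract can be simulated directly: draw a Gaussian process path on a grid and push it through a link function to obtain a random response-probability function. The squared-exponential kernel, bandwidth, and logistic link below are illustrative choices consistent with, but not prescribed by, the abstract:

```python
import numpy as np

def se_kernel(xs, bandwidth=0.2):
    """Squared-exponential covariance kernel on a 1-d grid."""
    d = xs[:, None] - xs[None, :]
    return np.exp(-0.5 * (d / bandwidth) ** 2)

rng = np.random.default_rng(4)
xs = np.linspace(0.0, 1.0, 50)
K = se_kernel(xs) + 1e-8 * np.eye(xs.size)   # jitter for numerical stability

# one draw from the prior on the response probability:
# a GP sample path mapped to the unit interval through the logistic link
f = rng.multivariate_normal(np.zeros(xs.size), K)
p = 1.0 / (1.0 + np.exp(-f))                 # values strictly in (0, 1)
y = rng.binomial(1, p)                       # binary observations given p
```

Shrinking the bandwidth makes the sample paths rougher, which is why allowing arbitrarily small bandwidth values matters for the consistency result: it lets the prior place mass near response-probability functions of any smoothness.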